On optimization methods for deep learning

نویسندگان

Quoc V. Le

Jiquan Ngiam

Adam Coates

Ahbik Lahiri

Bobby Prochnow

Andrew Y. Ng

چکیده

The predominant methodology in training deep learning advocates the use of stochastic gradient descent methods (SGDs). Despite its ease of implementation, SGDs are difficult to tune and parallelize. These problems make it challenging to develop, debug and scale up deep learning algorithms with SGDs. In this paper, we show that more sophisticated off-the-shelf optimization methods such as Limited memory BFGS (L-BFGS) and Conjugate gradient (CG) with line search can significantly simplify and speed up the process of pretraining deep algorithms. In our experiments, the difference between LBFGS/CG and SGDs are more pronounced if we consider algorithmic extensions (e.g., sparsity regularization) and hardware extensions (e.g., GPUs or computer clusters). Our experiments with distributed optimization support the use of L-BFGS with locally connected networks and convolutional neural networks. Using L-BFGS, our convolutional network model achieves 0.69% on the standard MNIST dataset. This is a state-of-theart result on MNIST among algorithms that do not use distortions or pretraining.

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

A Hybrid Optimization Algorithm for Learning Deep Models

Deep learning is one of the subsets of machine learning that is widely used in Artificial Intelligence (AI) field such as natural language processing and machine vision. The learning algorithms require optimization in multiple aspects. Generally, model-based inferences need to solve an optimized problem. In deep learning, the most important problem that can be solved by optimization is neural n...

متن کامل

A Hybrid Optimization Algorithm for Learning Deep Models

متن کامل

Integration of Deep Learning Algorithms and Bilateral Filters with the Purpose of Building Extraction from Mono Optical Aerial Imagery

The problem of extracting the building from mono optical aerial imagery with high spatial resolution is always considered as an important challenge to prepare the maps. The goal of the current research is to take advantage of the semantic segmentation of mono optical aerial imagery to extract the building which is realized based on the combination of deep convolutional neural networks (DCNN) an...

متن کامل

Optimal mathematical operation of a hybrid microgrid in islanded mode for improving energy efficiency using deep learning and demand side management

Deep learning method is used to predict the future value of load demand. Based on obtained results, a new model based on the forward-backward load shifting and unnecessary load shedding is presented. As well, to increase energy efficiency, excess renewable energy has been used to produce green hydrogen. For this purpose, GAMS optimization software has been used for optimal operation of the micr...

متن کامل

Detecting Overlapping Communities in Social Networks using Deep Learning

In network analysis, a community is typically considered of as a group of nodes with a great density of edges among themselves and a low density of edges relative to other network parts. Detecting a community structure is important in any network analysis task, especially for revealing patterns between specified nodes. There is a variety of approaches presented in the literature for overlapping...

متن کامل

Optimization for Deep Learning Algorithms: A Review

In past few years, deep learning has received attention in the field of artificial intelligence. This paper reviews three focus areas of learning methods in deep learning namely supervised, unsupervised and reinforcement learning. These learning methods are used in implementing deep and convolutional neural networks. They offered unified computational approach, flexibility and scalability capab...

متن کامل

ذخیره در منابع من

ذخیره در منابع من قبلا به منابع من ذحیره شده

{@ msg_add @}

با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

عنوان ژورنال:

دوره شماره

صفحات -

تاریخ انتشار 2011

On optimization methods for deep learning

نویسندگان

چکیده

منابع مشابه

A Hybrid Optimization Algorithm for Learning Deep Models

A Hybrid Optimization Algorithm for Learning Deep Models

Integration of Deep Learning Algorithms and Bilateral Filters with the Purpose of Building Extraction from Mono Optical Aerial Imagery

Optimal mathematical operation of a hybrid microgrid in islanded mode for improving energy efficiency using deep learning and demand side management

Detecting Overlapping Communities in Social Networks using Deep Learning

Optimization for Deep Learning Algorithms: A Review

عنوان ژورنال:

اشتراک گذاری